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ABSTRACT 

We present the open-source Python code pyMCZ that determines oxygen abundance and its distri¬ 
bution from strong emiss i on lin es in the standard metallicity calibrators, b ased on the original IDL 
code of Kewley & Dopita (2002 ) with updates from Kewley & Ellison (2008), and expanded to include 
more recently developed calibrators. The standard strong-line diagnostics have been used to estimate 
the oxygen abundance in the interstellar medium through various emission line ratios (referred to as 
indicators) in many areas of astrophysics, including galaxy evolution and supernova host galaxy stud¬ 
ies. We introduce a Python implementation of these methods that, through Monte Carlo sampling, 
better characterizes the statistical oxygen abundance confidence region including the effect due to 
the propagation of observational uncertainties. These uncertainties are likely to dominate the error 
budget in the case of distant galaxies, hosts of cosmic explosions. Given line flux measurements and 
their uncertainties, our code produces synthetic distributions for the oxygen abundance in up to 15 
metallicity calibrators simultaneously, as well as for E(B-V), and estimates their median values and 
their 68% confidence regions. We provide the option of outputting the full Monte Carlo distributions, 
and their Kernel Density estimates. We test our code on emission line measurements from a sample 
of nearby supernova host galaxies (z < 0.15) and compare our metallicity results with those from 
previous methods. We show that our metallicity estimates are consistent with previous methods but 
yield smaller statistical uncertainties. It should be noted that systematic uncertainties are not taken 
into account. We also offer visualization tools to assess the spread of the oxygen abundance in the 
different calibrators, as well as the shape of the estimated oxygen abundance distribution in each 
calibrator, and develop robust metrics for determining the appropriate Monte Carlo sample size. The 
code is open access and open source and can be found at https://github.com/nyusngroup/pyMCZ 

Subject headings: Galaxy: abundances — ISM: HII regions — supernovae: general 


1. INTRODUCTION 

Small amounts of carbon, oxygen, nitrogen, sulfur, iron 
and other elements provide a splash of color to the other¬ 
wise dominating grayscape of hydrogen and helium in the 
stars and gas of galaxies. Nevertheless, even this minute 
presence of heavy elements is imp ortant for many areas of 
astrophysics. For example, Johnson & Li (2012) suggest 
that if it were not for the relatively high metallicity level 
in our Solar System, planet formation may not have been 
possible. With Z representing the mass fraction of met¬ 
als, all elements heavier than He (also collectively called 
metallicity), for our own S un the value is measured to 
be Z = 0.0153 (|Caffau et al.||2011|), though there are 
suggestions of a lower Solar metallicity o f Z = 0.0134 
(Asplund et al.|2009 Grevesse et al. 


erly observed and estimated, metal 


2010). When prop- 


icity measurements 


of galaxies can tightly constrain models of galaxy for- 
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mation and evolution (e.g., |Kewley & Ellison 2008 and 
references therein), and shea light on the metallicity de¬ 
pendence and production conditions for different types of 
supernovae (S Ne) and long-durat i on gamma-ray bursts 
(GRBs) (e.g., iModjaz et al.||2008| iLevesque et al. 1120101 

*- al. Il M 


Anderson et al.||2010[ |Modjaz et al.||201H IKelly & Kir- 


shner 20121 Sanders et al. 20121 ILeloudas et al. 2014 


Lunnan et al.||2014 Pan et al.||2014|). 

Metals are produced in dhe cores of massive stars dur¬ 
ing their fusion life cycle but also during the extreme 
conditions of stellar explosions. For example, the ma¬ 
jority of iron is synthesized in thermonuclear explosions 
(SNe la) while nearly all oxygen and other a-elementsFl 
are released in various kinds of core-collapse SNe (SNe IF 
and stripped-envelope core-collapse SNe). 

However, for almost all astronomical objects, metal¬ 
licity cannot be measured directly. The oxygen abun¬ 
dance in the gas-phase is the canonical choice of metal¬ 
licity indicator for interstellar medium (ISM) studies. 
After Hydrogen, Oxygen lines are the strongest emis¬ 
sion lines in optical wavelengths. Oxygen is the most 
abundant metal and it is only weakly depleted onto dust 
grains, in contrast to refractory elements such as mag¬ 
nesium, silicon, or iron (iron for example is depleted b y 
more than a fa ctor of 10 in Orion; see Simon-Diaz & 


Stasiriska 2011). The oxygen abundance is expressed as 


7 a-elements are element with even atomic number lower than 
22, synthesized by a-capture 
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12+log 10 (O/H), where O and H represent the number of 
oxygen a nd h ydrog en atoms, respectively. For example: 
ICaffau et al.||20Tl measure a Solar oxy gen abundance 
of 12 + logmfO/Jf)= 8.76 ± 0.07, while |Asplund et~aL 
(|2009|) suggest 12 + log 10 (O/H)= 8.69p] While oxygen 
abundance is used to gauge metallicity, in many cases 
in the literature, including here, the terms metallicity 
and oxygen abundance are used interchangeably. Impor¬ 
tantly, oxygen exhibits very strong nebular lines in the 
optical wavelength range in spectra of HII regions (e.g ., 


Page! et al.|1979 Osterbrock|i989 Tremonti et al.|2604 ), 


which can be measured. Thus, many different diagnostic 


and other elements, have been developed (e.g., Kewley 

& Dopita||2002 

- hereafter KD02, Pettini & Pagel||2004 

- PP04,lfc>bu 

nicky & Kewley: 2004 KK04, Kewley 

& Ellison 

2008 

- KE08), and are discussed in the next 


section. 

Many fields rely on the determination of the metal¬ 
licity environments to understand physical phenomena, 
such as SN and planetary formation, from a causal, and 
possibly mechanicistic point of view. However many of 
these fields, including SN studies, have used metallicity 
calibration techniques somewhat acritically, at times in¬ 
ferring from calibrators that are not comparable, as they 
rely on different assumptions, because of the scarsity of 
complete dataset that would allow consistent estimates, 
and often ignoring many sources of uncertainty. 

The purpose of this paper is to present a public code 
that collects different abundance diagnostics to efficiently 
and rapidly compute metallicity from strong emission 
line fluxes as well as the associated statistical uncertain¬ 
ties due to the measured emission line flux uncertain¬ 
ties. This source of uncertainty is particularly relevant in 
the study of SN metallicity environments, since the host 
galaxies of these cosmics explosions are often at large 
distances. While in many other contexts the signal-to- 
noise (SNR) of the spectra themselves may contribute 
negligebly to the total uncertainty budget, compared for 
example to errors in the model parameters, it is often a 
substantial source of uncertainty for high Z galaxies that 
host SNe. The necessity to compute metallicity quickly, 
and systematically for large samples arises from the new 
availability of large samples, enabled by IMACS spec- 


MANGA( 

Bundy et al. 

2015) 

(Jur open-source Pyt 

non p 


This Python code allows the user to quickly produce 
metallicity values with sensible confidence regions for 
several metallicity calibrators at once, given an input 
set of spectral line measurements and their errors, and 
to obtain and visualize the distribution of metallicities. 
pyMCZ is structured in a modular way in order to allows 
the user to include other strong line metallicity calibra¬ 
tion methods, and link to external code packages, with 
minimal modifications, and naturally extend the advan¬ 
tages of the Monte Carlo (MC) based propagation of the 
observational uncertainties to the newly included calibra¬ 
tors. While we do not mean to advocate for a particular 
metallicity calibrator to be adopted, the comparison of 


8 The ongoing debate about the value of the Solar oxygen abun¬ 
dance should be kept in mind when metallicities are expressed rel¬ 
ative to solar metallicity. 


multiple calibrator outputs, and the shape of each metal¬ 
licity distribution, can guide the user in understanding 
the reliability of a metallicity estimate, given a set of line 
fluxes. Below we describe the different oxygen abundance 
diagnostics, and our Python module implementation. 


2. OXYGEN ABUNDANCE DIAGNOSTICS AND 
CALIBRATORS 

In this section we present a brief overview of the var¬ 
ious observational methods for measuring the gas-phase 
oxygen abundance - however, for a detailed discussion, 
and to understand the many caveats, we e ncou r age th e 


reader to read the reviews by, e.g., Stasinska (2002) 


KE08, M£](glg^sgtal. (2010J), Stasinska (2(JT0), 


opez- 


San chez et al. (|20I2 b Uopita et al. ] 2018 hereafter DI3 
and|Blanc et al.| (|20I5|). 

The so-called “classical” way to estimate oxygen abun¬ 
dances is the electron temperature ( T e ) method, which 
estimates the electron temperature and density of a neb¬ 
ula using a number of oxygen lines in different ionization 
states, including the auroral [O III] A4363 line, to then di¬ 
rectly estimate the O II and O III abundances and finally, 
after correcting for the unseen stages of ionization, to 
obtain the total oxygen abundance. However, except for 
low-metallicity environments, the auroral [O III] A4363 
line is very weak, a nd it sa turates at metallicities higher 
than solar (|Stasinska||2002|), since at higher metallicities 
the cooling is dominated by the oxygen fine structure 
lines in the near-infrared (NIR). In addition, there are 
other caveats about the T e -method, as fluctuations and 
gradients in temperature or in chemical composition may 
lead to u nderestimates o f the o xygen a b und a nce (see f or 
example|Peim bert|1967[ an d Lopcz-Sanchez et al.[2012). 
Most recently, Berg et al. (2UI5p suggested that among 
the auroral lines [O 11], [O 111] and [N II], the [O III] 
A4363 line, commonly used for T e measurements, is the 
most problematic one, giving rise to temperature discrep¬ 
ancies. Thus, other methods have been developed that 
estimate the oxygen abundance from ratios of strong neb¬ 
ular lines in the spectra of HII regions, including amongst 
others [O III] AA 4959, 5007, [O II] AA 3726, 3729, [N II] 
A 6584, [S II] AA 6717, 6731, as well as Ha and H/?. These 
are called strong-line calibrations and are the subject of 
this manuscript b] 

Strong-line methods can be categorized into three 
types, depending on how they calibrate the observed 
emission line ratios: 


• empirical methods, which calibrate li ne ratios on 
observed 


Pi! 


ugm 


& 


_T^-based metallici ties (e.g. 

Thuan|2005 - hereafter P 05; |Pilyugin et al.||20l() 
P10; |Pilyugin et al.|2012|- P12; |Marino et al.|2013| 
- M13J7^ 

theoretical methods, which rely on calibrating var¬ 
ious observed line ratios using stellar population 
and photoionization models (basically theore tically 
simulating HII regions, e.g.,|McGaugh|199T1- here¬ 


after M91, KD02, Tremonti et al. 12004 ] 


9 There is one more class of methods, for which the recombi- 
nation lines of different me tal ions are used (e.g., Stasiriska||2002| 
| Lopez-Sanchez et al.|[2012[ ). However, these lines are so weak (tor 
O and C they are ~ 10 —4 of H ( 3 ) that these methods can be used 
for HII regions in the Milky Way and in the Local Group, but not 
over large extragalactic distances, in which we are interested. 
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and lastly hybrid methods that use a mixture of 


Denicolo et al. 

2002 

hereafter D02, Pettini & 

Pigel 2004 

P 

P04; 

Maiolino et al. 2008 - M08; 

and 1 h-'rez-. 

Vlontero 2014 - PM14). 


The ionization parameter q needs to be taken into 
account in all calibrations, as HII regions at the same 
metallicity but with different ionization parameters pro¬ 
duce different line strengths. The importance of the ion¬ 
ization parameter, which can be physically understood 
as corresponding to the maximum velocity of an ionized 
front that can be driven by the local radiation held of hot 
massive stars that are ionizing the ISM gas, was dem on- 
strated in a number of works including |Evans fe D opita 
1985| jpopita fc Evans|1986| KD02, PM14, and |Sanchez 


et al.||2dT5 All calibrations include its impact, either 
implicitly (by assuming the input object has the same 
values as the calibration sample) or by explicitly solving 
for it. 

While historically there have been large systematic off¬ 
sets between the T e method and the strong line methods, 
partially due to real temperature gradients th at exist 
in the hi gh-ionizat ion zones of the HII regions (Lopez- 
Sanchez et al. 2012"), D13 demonstrated that the results 
of the l' e method and strong line methods are recon¬ 
ciled if the energy distribution of the electrons in the 
HII regions is not assumed to be a simple Maxwell- 
Boltzmann distributio n (as in prior works), but a more 
realistic n distribution (Vasyliunas 1968 Owocki & Scud- 
der||1983 ), as observed m Solar plasma. In addition they 
find that the effect of changing the specific n distribution 
on t he strong-lin e methods is minor (see also Mendoza 
& Bautista||2014|). 

There is along-standing debate about which diagnos¬ 
tic to use, and there are systematic metallicity offsets be¬ 
tween different methods (recombination lines vs. strong¬ 
line method vs. the “direct” T e method), however, the 
relative metallicity trends can be considered robust, if the 
analysis is performed self-consistently in the same cal¬ 
ibrator, an^Jrend^areseen^cross different calibrators 
(KD08, Moustakas et al.||2010 ). It is of course necessary 
to estimate uncertainties tor the relative comparisons to 
be meaningful. Note that while conversion value s be¬ 
tween different calibrators fKewley & Ellison 2008) have 
been estimated, they are average quantities derived from 
large data sets (tens of thousands of SDSS galaxies), and 
they should be used with caution, if at all, on individual 
metallicity measurements. 

3. COMMONLY USED METALLICITY INDICATORS 

Line ratios commonly used as metallicity indicators 
(used in PP04, P05, P12, and M13) are 


TV 2 = 


[N II] A6584 

Ho 


and th e Q3T V2 index, first introduced by Alloin 
et al.| (|1979|) and defined as 03/H/3/TV2 where 


03/H/3 = iu III]A4959+[0 III]A5007 ; thug . 

= [O III]A4959+[Q III]A5007 

H/3 [N II] A6584 

Since TV2 employs two closely spaced lines (Ha and 


Ha 


[N II] A6584), their ratio is not affected by stellar absorp¬ 
tion or (uncertain) reddening, and they are easily ob¬ 
served in one simple spectroscopic setup. Thus the PP04 
calibration based on TV2 has become a popular calibra¬ 
tor to use f or low-z SN host galaxy studies (see meta¬ 
analyses by |Sanders et al.||20l2} |Modjaz|[2012| |Leloudas] 
et al.|2014 and others). However, it is important to note 
that the PP04-N2 calibration has a number of short¬ 
comings: the PP04 calibration implicitly assumes the 
same relationship between the line ratios and the ioniza¬ 
tion parameter as in the calibration sample from which it 
was initially derived, which includes only 137 extragalac- 
tic HII regions, and it is a hybrid calibration method. 
An updated calibration by M13 is based on a sample of 
HII regions almost three times larger than that of PP04 
and, more importantly, only uses T e -based metallicities. 
M13 finds a significantly shallower slope in the relation¬ 
ship between the TV2 and 03TV2 indices and the oxygen 
abundance. Our code outputs M13 as a default, and 
PP04 if explicitly requested. It is important to note that 
the employed nitrogen emission line saturates at high 
metallicity, and thus the TV2 methods, including M13, fail 
at high-metallicity galaxies - at 12 + log 10 (O/T7) > 8.8 
(KD08). 

Additional strong line indicators include the N202 ra¬ 
tio: 

[TV II] A6584 


N202 = 


[O II] A3727 ’ 


introduced by KD02, which exploits the leverage offered 
by the large diatance in wavelength of the [TV II] A6584 
and [O II] A3727 lines, but is then very sensitive to red¬ 
dening and thus requires the measurements of Ha and 
H/3 to be available, and the Oxygen based 0302 indica¬ 
tor: 

[O III] A4959 + [O III] A5007 


0302 = 


[O II]A3727 


already introduced by M91, which has the obvious advan¬ 
tage of using only Oxygen lines, which are, as mentioned, 
the strongest emission lines in optical wavelengths apart 
from Hydrogen. 

The ratio o f o xygen line fluxes to H/3 is referred to as 
i ?23 (Pagel et al. 1979): 


T?23 = 


[O II]A3727 + [O III]AA4959,5007 

H/3 ’ 


where [O III] AA 4959,5007 stands for the sum of the two 
[O III] lines. This popular indicator is double-valued with 
metallicity: the same R 23 value may correspond to two 
metallicity values, and thus other indicators need to be 
used to break the degeneracy between the high (“upper 
branch”) and the low val ues (“lower branch”) of the R 23 
metallicities (e.g., KD08, |Moustakas et al.|2010[). A sim¬ 
ilarly double valued indicator is 1823 , which is based on 
the sulfure lines: S23 = ([S II] A6717+[S III] A9069)/H/3. 
For these double-values indicators the curvature of the 
relationship between the line ratio and the metallicity in 
the two branches depends on the ionization parameter, 
and both empirical and theoretical calibrators that use 
them have to solve for q. For the theoretical strong-line 
method, calibrations of R 23 by M91, KD02, and D13 
use different theoretical photoionization models: differ¬ 
ent stellar populations codes (e.g., Starburst 99, based on 
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Bruzual A. &; Chaiiot|[l993] or POPSTAR, Moll a et al 
20091) and p hotoionizatio n codes (e.g., MAPP INGS Allen 
et al.|2008 or CLOUDY, Ferland et al.|1998 ) to calibrate 
different line ratio indices. M91 and KDU2 use an itera¬ 
tive process to break the R 2 3 degeneracy and to constrain 
the ionization parameter q. Other calibrations, such as 
Z94, do not explicitly solve for q. On the empirical cal¬ 
ibrators side, P10 introduces the P parameter, which is 
a modified version of the ionization parameter q. 

As can be seen, each calibrator has different advan¬ 
tages and disadvantages and should be used in different 
metallicity regimes (see de tailed discussion in, for exam¬ 
ple ,jCD02 J B^i^^^002] KD0 8,lMoustakas et al.|20i0| 
Lopez-Sanchez et ai.|2012 D13, Blanc et al.|2O150. Thus, 
pyMCZ outputs the oxygen abundance in the mam metal¬ 
licity calibrators whenever the lines necessary to calcu¬ 
late the indicators used by each method are available. 
As default output values for 11 metallicity diagnostics 
are generated (version vl.3, Fall 2015). By “diagnostics” 
here we mean a suite metallicity calibrators described 
in a single paper (e.g. KD02), and some of these diag¬ 
nostics describe multiple calibrators based on different 
indicators: the KD02 diagnostic have four outputs and 
the PP04, P10, M13 , M08, and D13 diagnostic have 
multiple outputs as well. 

4. PYMCZ: A PYTHON MODULE FOR STRONG-LINE 
METALLICITY DIAGNOSTICS 

We developed a Python package, pyMCZ for the si¬ 
multaneous calculation of 12 + log 10 (O/H) in various 
calibrators, and its confidence region. We started with 
the iterative IDL code by KD02 hereafter referred to as 
IDLKD02, specifically a version updated in KE08. We 
translated the code i nto Python and added new, more 
recent calibrators (see |subsection 4.2|) and new features, 
of which the most important is the capability of obtain¬ 
ing uncertainties on the metallicity outputs via MC sam¬ 
pling. 


4.1. Availability and Technical details 

The source code is published under MIT-licens^j on 
GitHub{3 At this time the code is released under DO I: 
10.5281/zenodo.17880 and the most recent stable version 
is version 1.3: pyMCZ vl.3. Project details, step-wise tu¬ 
torials, and further information can be found in the mod¬ 
ule README^ Development is done in Linux and OS 
X. The package requires standard python packages, such 
as numpy, scipy, pylab, and additional features are en¬ 
abled if the packages astroml, skitlearn, and pyqz are 
installed, but these packages are not required. pyMCZ can 
be run as a standalone package from the local directory, 
or it can be installed (via python setup.py install). 

4.2. Input and Output of code 

The input of the code is a set of spectral emission line 
fluxes. We assume that the observed emission lines orig¬ 
inate in HII regions and are not due to non-thermal ex¬ 
citation by, for example, AGN, interstellar shocks from 

1(1 https: //github. com/nyusngroup/pyMCZ/blob/master/ 

LICENSE.txt 

1 1 https : //github. com/nyusngroup/pyMCZ 

12 https://github.com/nyusngroup/pyMCZ/blob/master/ 

README.md 


SNe, or stellar winds. Tests to exclude data contami¬ 
nated by such non-thermal sources should be executed 


cators by, e.g., Baldwin et al. ( 

1981), Kauffmann et al. 

(2003 

), 

Kewiey et al. (2006) 

, and 

Gid Fernandes et al. 

(2010 

i; 

banchez et al. ((2015 

). Furthermore, these lines 


should have all the correct flux calibration (at least cor¬ 
rect relative calibration) and should have a SNR of at 
least 3. The latter requirement is important for the suc¬ 
cess of the sampling technique. In our code, synthetic 
line flux measurements are drawn within a Gaussian dis¬ 
tribution with standard deviation equal to the measure¬ 
ment error, and centered on the measured flux value 


(subsection 4.3). Thus, a SNR >= 3 assures that fewer 
than ~ 1 % of the sampled fluxes fall below zero (and 
are thus invalid). The code checks if for any line the 
SNR >= 3 condition is not satisfied, and issues a warn¬ 
ing message. 

Emission line flux values are fed into our Python imple¬ 
mentation as in the original IDLKD02 code. The inputs 
are emission line flux values, and their uncertainties, for 


A 6300, [O II] A 3727, 
N II] A 6584, [S II] A 


the following lines: Ha, H/3, [O I 
[O III] A 4959, [O III] A 5007, 

6717, [S II] A 6731, [S III] A 9096 and [S III] A 9532. 
The latter two lines can be used to calculate the sulfur 
ratio S 23 , but are not often observed since they are in 
the NIR. Only o ne metallici ty calibrator based on S 23 , 
Dfaz & Perez-Montero 2000 (DP00), is implemented in 
the current version of the code. The line fluxes are to 
be stored in an ASCII file, and the measurement errors 
in a separate ASCII file (for consistency with the orig¬ 
inal IDLKS02 code; consult the README. mcj^l in the 
GitHub repository for details about the inputformat, 
and find example files in the repository). If the fluxes 
for the specified lines are not available, the entry should 
be set to NaN. The oxygen abundance will be calculated 
only for metallicity calibrators that use valid, non- NaN, 
line fluxes. If the line fluxes necessary for specific cali¬ 
brators are not provided, the output metallicities will de¬ 
fault to NaN. In absence of measurement uncertainties, 
the error should be set to 0.0 in the input ASCII file. 
If the errors in the measurements are not provided, the 
code will specify that it cannot create a measurement dis¬ 
tribution or determine a confidence interval, but it will 
calculate and output the nominal metallicity obtained 
from the input flux measurement. 

The inputted line fluxes are corrected for reddening by 
using the observed Balmer decrement, for which Ha and 
H/3 flux values need to be provided. We assume case B 
recombination, and thus the standar d value of 2.86 as the 
intrinsic Ha/H/3 ratio ( Osterbrock| 1989), and apply the 
standard Ga lactic reddening law with Ry = 3.1 (Cardelli 
et al. 1989). However, the user can choose other ex¬ 
tinction laws and Ry values, if desired, given the code’s 
open-source nature. If the input measurements are al¬ 
ready de-reddened, the user can disable the reddening 
correction. If either Ha or H/3 are not provided, the 
reddening correction cannot be implemented. The user 
is notified with a warning message and has the option 
to proceed with the calculations with uncorrected line 
fluxes. This option is enabled, and should of course only 


13 https://github.com/nyusngroup/pyMCZ/blob/master/ 

README.md 
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TABLE 1 

CALIBRATORS AND LINE RATIOS/lNDICATORS FROM WHICH THE CALIBRATORS ARE DERIVED 


Calibrator a 


Lines used for metallicity calculation b Support lines and calibrators c 


Reference 


M91 

Z94* * 

(DPOO*) 

(P01) 


R23, 0302 
R23 
S23 

R23 , 03/H/3, [O IIJA3727/H/3 


N2, N202 iMcGaughl ill 991|) 

IZaritsky et a.f.l (I I tJ4]i 

Diaz & Pcrcz-Iviontcro (2000) 
N2, N202 - |Pilyugin| plTTf - 


(C01-R23) 


R23, [O III]A5007/H/3, g "ff 5 7 0 2 0 7 7 


(C01-N2S2) 


[O II] A3727 [N II] A65S4 

K23 ’ [O III]A5007 ’ [S II] A6717 


Chariot & Longhetti 

o 

o 

<M 


Chariot & Longhetti 

O 

o 

<M 


D02* 

N 2 

- 


Denicolo et al. ( 

2002| 


KD02.N2O2* 

N202 

- 

E 

Cewley & Dopita 

(2UU2) 

(PP04_N2Ha*) 

N 2 

- 


Return & Eagei 

2UU4 


(PP04_O3N2*) 

N2, 03/H/3 

" 


EeTEini & Pagel 

2004 



P05 R23, 03/H/3, [O II]A3727/H/3 

KK04_R 2 3 R23 1 ?, 

KD02_comb d M91, KD02.N2O2, KD02_N2Ha, KD04.R23 


N2, N202 
N2, N202 
N2, N202 


|Kobulnicky & Kewleytif 


|i- , iivngiri iz, tnuan| 


Kewley & Ellison 


Kewiey & Ellison 


g 

2ffij.,: 

2003) 


(2008) 


M08.O3O2 


0302 


Maiolino et al. (20081) 


M08_N2Ha 

M08-R23 

(M08_O3Hb) 

(M08_O2Hb) 


(M08-O3N2) 


P10_ON 

P10_ONS 


N 2 

R23 

[O III] A5007/H/3 
[O IIJA3727/H/3 

[O III] A5007 
[N II] A6584 

N2, 03/H/3, [O IIJA3727/H/3 
[N II] A6584/H/3, 03/H/3 


M08-O3O2, M08_N2Ha 
M08_O3O2, M08_N2Ha 
M08_O3O2, M08_N2Ha 


M08-O3O2, M08_N2Ha 


Maiolino et al. 

2008 

Maiolino et al. 

20US 

Maiolino et al. 

(2003 

Maiolino et al. 

2003 


Maiolino et al. (2008 I 


Pilvugin et al. 

(2010 

Dilyugm et al. 

c 

c 


[O II] A3727/H/3, I s H]A6717+[S II]A6731 


M13.N2* 

M13-03N2* 

D13_N2S2_03S2 

D13_N2S2_03Hb 

D13_N2S2_0302 

D13_N202_03S2 

D13_N202_03Hb 

D13_N202_0302 

D13_N2Ha_03Hb 

D13_N2Ha_0302 


[N IIJA6584/H/3 
[N II] A6584/H/3, 03/H/3 

[N II] A6584 [O III] A5007 

[S II]A6717+[S II]A6731 ’ [S II]A6717+[S II]A6731 


[N II] A6584 

[S II] A6717+ [S II] A6731 ’ uo / n P 

[N II]A6584 mD9 

[S II]A6717+[S IIJA6731’ 

N2Q2 _ [O III]A5007 _ 

J S n ] A6717+ [ S IIJA6731 

N202, 03/H/3 

N202, 0302 

N2, 03/H/3 

N2, 0302 


Marino et al. 

(2013 

Marino et al. 

(2013 


Dopita et al. (2013) 


Dopita et al. (2013) 


Dopita et al. (2013 ) 


Dopita et al. (2013 i 


[Dopita et al.| (|2013[) 

|Dopita et al.| (|2013| 

[Dopita et al.| (|2013[) 

[Dopita et al.| (|2013| 


a Calibrators in parentheses are optional outputs of our pyMCZ. 
b The flux ratios used to calculate the metallicity. 

c Additional flux ratios used to estimate the metallicity regime, select the upper or lower branch in double valued 
metallicity solutions, initiate minimization, etc. 

d This combined calibrator uses the output of other metallicity calibrators in different regimes and outputs a “rec¬ 
ommended” value appropriate for that abundance regime. 

* These calibrators use fully linear calculations and the observational uncertainty could therefore be propagated 
analytically. 
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E(B-V) 

measurement 1 of 2 

median: 0.120 
16th Percentile: 0.102 
84th Percentile: 0.137 - 
MC sample size 2200 (2000) 
histogram rule: Knuth's rule 
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! 

o 

o 
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> 


exampledata 


0.4- 


0 . 2 - 


0.20 


8.70 



PP04 03N2 

measurement 1 of 2 

median: 8.920 

16th Percentile: 8.881 

84th Percentile: 8.976 - 

MC sample size 2190 (2000) 
histogram rule: Knuth's rule 


8.90 9.00 

12+log(0/H) 


9.10 


9.20 


KK04 R23 

measurement 1 of 2 

median: 9.059 
16th Percentile: 9.058 
84th Percentile: 9.060 - 
MC sample size 2190 (2000) 
listogram rule: Knuth's rule 



I ' 


9.05 


9.06 


9.06 9.06 

12+log(0/H) 


9.06 


9.06 



KD02comb 

measurement 1 of 2 

median: 8.811 
16th Percentile: 8.799 
84th Percentile: 8.822 - 
MC sample size 2200 (2000) 
histogram rule: Knuth's rule 


8.78 


80 8.82 
12+log(0/H) 


8.84 


8.86 


Fig. 1.— Metallicity and reddening E(B-V) parameter distri butions based on the example d ata shown in | Table 2| emission line data 
of the HII regions at the position of SN 2008D, published in |Modjaz et a l. 2011 (see |section 5| for a discussion ol the test sample). The 
distributions are generated from iV=2,000 samples. The median values are shown with dashed whit e lines, while the confi dence region, 
between the 16 th and the 84 th percentile is shaded in orange. We show the metallicity calibra tors from Pettini &; Pagel (2004), using [O III] 



aputed metallicrty 


wmuutcu mcucmiv.iuy canuic iuuio. wc nuuo ± ± yj-± canuicitwio aic an optional outputs ot our code, since they are superseded by M13 (s cc 
|section ~2 \ . However, we use them for comparison with the results obtained injModjaz et al.|2011| Each plot indicates the calibrator, the 
sequential number of measurement in the input file (here “measurement 1 of 2", winch corresponds to the first line of the input ASCII 
files), the median, 16 th and 84 th percentile values, the sample size (which is by default initially set to 10% larger than the requested N 
value, but can be smaller if some sim ulations lead to inval id metallicities), and, finally, the method used to choose the bin size for the 
histogram (Kuth’s rule in this case, see [subsubsection 4.3.1 [ ). 
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be chosen, for cases in which the spectral fluxes are al¬ 
ready dereddened before being fed is input to pyMCZ. 

We implement the calculation of 12 + log 10 (O/H) val¬ 
ues and their uncertainties from the m etallicity calibra- 
tors listed below, and summarized in |Tablc l[ imple¬ 
mented as prescribed in KE08 where not otherwise noted. 
We emphasize that subsets of these calibrators use the 
same input line ratios, and that related assumptions and 
similar observation and simulations are used to generate 
the calibrator algorithms (e.g. there are several calibra¬ 
tors that are essential linear or near-linear functions of 
N2: PP04_N2, KK04.N2 and M13_N2). Thus the oxygen 
abundance results produced using different calibrator are 
far from independent. 


• M91 (McGaugh 1991) is an i? 23 ~based theoreti¬ 
cal calibrator which also determines the ionization 
parameter. To break the R 23 degeneracy we use 
N202 , following KE08 Appendix 2. 


• Z94 (Zaritsky et al. 1994) which is valid for the 
upper branch of -K 23 only, and we conservatively 
constrain it to log(i? 23 ) < 0.9, the range that is 
covered by the photoionization model grids. 


D02 ( Denicolo et al.|2002 ) for which we include, in 
addition to the uncertainties in the measurements, 
the uncertainty in the fit parameters as they ap¬ 
peared in the original paper. 


P05 (Pilyugin & Thuan 
method calibrated via T t 


2005) an i? 2 3 -based 
met alii cities on a sam¬ 


ple of HII regions. We use the values of N202, 
and N2 to discriminate between upper and lower 
branch. 


KD02 &: KK04: 4 calibrators: KD02JM2Q2, 
which uses the N202 indicator 


pita 

2002|), KK04JN2Ha which uses the N2 in- 

dicator (Kobulnicky & Kewley 

2004 

), KK04_i? 23 

(Kewley & Ellison 2008 appendix A2.2), which 


method, KD02_comb that chooses the optimal 
method given the input line fluxes an d is imp le¬ 
mented as des cribed in Appendix 2.3 of |Kewley fe| 


Ellison (2008). 


• M08 (Maiolino et al. 20081: This calibrator is 
a combination of the KD02 photoionization mod¬ 
els at high metallicities and T e based metallicities 
at low metallicities. Our default output includes 
their strong line diagnostic, which is based on R 23 , 
and the diagnostics based on 0302 and [N II]/Ha, 
since the metallicity estimates from [N II]/Ha or 
0302 are necessary to resolve the degeneracy in 
the double-valued R 23 metallicity. The other cal¬ 
ibrators (based on [O II]/H/?, [O IIIJ/H/3, and 
[O III] / [N II]) can be outputted upon explicit user 
request, via the command line options. 


P10 (Pilyugin et al. 2010): This is the so-called 


ONS diagnostic (involving the [O II], [O III], [N II] 
and [S II] lines) and is calibrated with HII regions 
that have T e based metallicities. 


• M13 (Marino et al. 
tors: M13 JN2, which 


2013): Two 

linear fit 


is a 


calibra- 
to the 


[N II] A6584/Ha, and M13_03N2. This is an up¬ 
dated calibration of the PP04 03N2 method, based 
on a large number of T e -based metallicity measure¬ 
ments, including those from the Calar Alto Legacy 
Integral Field spect roscopy Area survey (CALIFA, 
|Sanchez et al.|20T2] almost three times larger than 
the sample used in PP04). This method derives 
a significantly shallower slope between the 03N2 
index and the oxygen abundance than the PP04 
calibration did. 


• D13: (Dopita et al. 2013). The photoionization 
models used in KD02 ana in KK04 are updated 
by including new atomic data within a modified 
photoionization code and by assuming a k distri¬ 
bution for the energy of the electrons in the HII re¬ 
gions, rather than the simple Maxwell-Boltzmann 
distribution assumed in prior works. This distribu¬ 
tion is more realistic, as observed in Solar plasma 


(Nicholls et al. 20121. If the user has installed the 
publicly available pyq#’] Python module, [N II], 
[S II], [O III], Ha, ancHH/3 lines are fed to the 
pyqz module, which produces up to eight emission 
line ratio diagnostics for 12 + log 10 (O/I4), each us¬ 
ing two of the indicators [N II]/[S II], [N II]/Ha, 
[O III]/[S II], and [O IIIJ/H/L Our code sets 
the k parameter to 20, which is the value that 
D13 found best resolves the inconsistencies between 
oxygen abundance values derived from the strong¬ 
line methods and from the “direct” T e methocp] 



in the current version of the code. P01 is super¬ 
seded by P05. C01 produces a diagnostic based 
on R 23 , C01J? 23 , and one based on [N II]/[S II], 
C01_N2S2; C01_I?23 was used to calculate the com¬ 
bined calibrator described in KK04, and included 
in the IDLKD02 code, but it is no longer used in 
the new combined calibrator KD02_comb, which 
supersedes the old one. Thus, P01 and C01 are 


only incl u ded f or historical reasons. PP04: (Pettini 
& Pagel 2004| includes two calibrators PP04JN2, 
based on the [N II]/Ha ratio, and PP04_O3N2, 

based on (). calibrators are commonly 
used in the literat ure investigating SN and GRB 
environments (see Modjaz et al. 2011 and refer¬ 
enced therein) They ar e superseded by the M13 
calibrators (see section 2), but are included here for 
completeness. These five calibrators are not part 
of the default output of our code, however they are 


14 https://datacommons.anu.edu.au:8443/DataCommons/item/ 
anudc:5037 

±0 The user can modify the value of k, by editing the code, if they 
wish. The user may then need to generate the appropriate diagnos¬ 
tic grids, see pyqz s instructions http: //f pavogt. github. io/pyqz/ 
for further details. The current version of pyMCZ is compatible with 
versions 0.5.0 through 0.7.2 (current version at the time of publish¬ 
ing this work) of pyqz. 



















































































still available upon explicit user request via com¬ 
mand line input. 

The following diagnostics were calculated in the origi¬ 
nal IDLKD02 code, and are discussed in detail in KD02 


and Kewley & Ellison (2008): C01, P01, M91, Z94, D02, 
PP04, P05, KD02, KK04, and KD02_comb. We refer 
the reader to those papers for further details. DP00 is 
the only diagnostic that relies on sulfur ratios £>23 • The 
shortcomings of S 23 as a tool to measure abundances are 
discussed in KD02. The DP00 diagnostic we implement 
is modified with the addition of a term oc (c + 3 ) — 1 
as suggested by KD02, which corrects the tendency of 
the calibrator to systematically underestimate the abun¬ 
dance, with a discrepancy growing larger at higher metal- 
licity. However, we point out that the scatter in metal- 
licity derived from this diagnostic compared to others 
remains high. 

The distributions of E(B-V) and log(i? 23 ) values are 
also part of the default output. Certain parameters, such 
as the ionization parameter q and the electron density 
(using the [S II] lines) are computed as long as the nec¬ 
essary lines are provided, but are not outputted in the 
current version of our code — however, since the code 
is open-source, the reader can easily modify the code to 
suit their needs. 

4.3. Computing Statistical Uncertainties 

Three kinds of uncertainties must be taken into ac¬ 
count when calculating metallicities: 1) the propagation 
of measurements errors of the line fluxes; 2) the prop¬ 
agation of uncertainty in the calibration (either from 
model-fitting or propagating the uncertainties of the co¬ 
efficients in the linear calibrations); 3) the systematic 
offsets/errors between different calibrations with respect 
to the “true” value of the metallicity. 

In our work we account for errors due to the emission 
line measurement uncertainties, and, when provided by 
the authors, to the calibration uncertainties, and propa¬ 
gate them into the derived metallicities. The propagation 
of the observational errors cannot, in most cases, be con¬ 
ducted analytically, since many metallicity calculations 
rely on, for example, decision trees. Thus the MC envi¬ 
ronment is necessary in most cases to properly assess the 
effect of the measurement uncertainties, and, for consis¬ 
tency and to generate a simple, modular code, we adopt 
it as the method for uncertainty calculation in all cases. 
We provide the full metallicity distributions, for all cal¬ 
culated calibrators. The latter is important since, as we 
show, metallicity distributions are usually not symmetric 
and can be multimodal (for example for the i ? 2 3 based 
calibrators). In addition, we provide the full metallic¬ 
ity parameter distribution to the user. We note that we 
do not address the third source of uncertainty: any sys¬ 
tematic errors between different calibrators. This is the 


subject, for example, of Lopez-Sanchez et al. (2012). 

For every set of input line measurements we intro¬ 
duce a MC sampling method to obtain iterations via 
random sampling within the measurement uncertainties, 
and thus we obtain a robust result for statistical error 


estimation (e.g., |Efron 

1979 

Hastie et al. 

05 

O 

O 

CM 

Andrae 

2010 

!. Given a dataset with 

error bars trom which cer- 


tain parameters are estimated, MC sampling generates 
synthetic data samples from a given distribution. We 


draw synthetic data from a Gaussian distribution cen¬ 
tered on each measured line flux value, with standard 
deviation corresponding to the measurement uncertainty. 
The implicit assumption is made that the line flux error 
is Gaussian distributed in nature, and that the errors are 
independent. However, the user who wishes to provide 
their own probability distribution for the emission line 
uncertainties can easily modify the code to include the 
desired distribution. For each metallicity calibrator, and 
for each of the TV values chosen randomly within the rel¬ 
evant emission line distributions, we run the calculations 
that compute the metallicity. By generating synthetic 
data, this method effectively simulates conducting mul¬ 
tiple experiments when repeating observations is imprac¬ 
tical or impossible, as is the case of the emission line flux 
data. 

The sample size TV is set by the user, and one should 
expect an appropriate value of TV to be a few 1000s, de¬ 
pending on the metallicity calibrator chosen and mea¬ 
surement errors (for example: TV = 2,000 is deter mined 
to be suffic ient for our example data, as shown in sub¬ 
section 4.4). 

A distribution of parameter estimates for the oxygen 
abundance is generated, from which the median metal¬ 
licity and its confidence region are calculated, and the re¬ 
sults arebhmedandvisualized in an outputted histogram 
(see |subsubsection 4.3.1). This is done for each calibra¬ 
tor the user chooses to calculate. The fiftieth (50%) per¬ 
centile, i.e. the median, is reported as the estimated 
metallicity value, and the 16 th and 84 th percentiles of the 
metallicity estimate distribution, as its confidence region. 
The user can also choose to output the full metallicity 
distribution as binanp~^1 or ASCII files. 

This MC sampling approach takes into account the 
impact of the uncertain reddening (due to the uncer¬ 
tainties in the measurement of the Ha and H/5 fluxes), 
when the option for de-reddened metallicities is chosen. 
For each synthetic set of measurements, a new redden¬ 
ing value is calculated based on the sampled Ha and 
H/5 fluxes, and used to compute the de-reddened metal¬ 
licity value, so that the derived distribution of metallic¬ 
ity values naturally takes into account the uncertainty 
in reddening. The median value, and confidence inter¬ 
vals for E(B-V), as well as a distribution histogram, are 
outputt ed togethe r with the metallicity calibrators (first 
panel in Figure 1|). If either Ha or H/T are not provided, 
however, no reddening correction can be applied. The 
computed metallicity will not be reddening-corrected and 
t he E(B-V) output will be set to zero. 

|Figure l] shows metallicity estimate distributions 
for three representative calibrators (PP04_O3N2, 
KK04_i? 2 3, and KD02_comb), and for the reddening 
parameter E(B-V). Similar plots are produced for all 
calibrators calculated and for log(i? 23 ) as a default 
output (and are available online in the pyMCZ GitHub 
repositor jQ) . Although the input line flux distributions 
are Gaussian, the metallicity distributions are not, for 
two reasons: first, since the metallicities are computed 
based on log values of line flux ratios, symmetric error 
bars in linear space will translate into asymmetric error 
bars in log space; and second, some metallicity calibrator 

16 using the pickle Python module 

1 ' https://github.com/nyusngroup/pyMCZ 
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computations are non-linear, and sometimes bimodal, 
especially those that use R 23 and S 23 , as shown in 
Figure 2| since the upper or lower branch metallicity 
value has to be chosen to break the degeneracy for each 
synthetic measurement. 



12+log(0/H) 

Fig. 2.— Metallicity distribution according to the KK04 R 23 
calibra tor for the site of SN 2007rw, as measured in |Modjaz et al.~] 
I 20fT|. The R 23 based calibrators are double valued: tor the same 
H 23 value there are two metallicity solutions. The KK04 ft 23 cal¬ 
ibrator uses the ionization parameter q in an iterative fashion to 
determine the metallicity by selecting either the upper or lower 
branch value. I 11 some cases, when the errors in the measurements 
are large, or for particular flux ratios, the solution may oscillate be¬ 
tween the upper and lower branch in different realizations, giving 
rise to a bimodal metallicity distribution. These cases are easily 
identifiable by looking at the visual tools the code generates, such 
as this histogram. 


Since the metallicity distributions are not Gaussian, 
the percentiles we report cannot be expressed in terms of 
er values. In determining the confidence region for asym¬ 
metric and multi-m odal di stributions, there are broadly 
three approaches (Andrae 2010): choosing a symmetric 
interval, the shortest interval, or a central interval. With 
the central method, we determine the confidence inter¬ 
val by choosing the left and right boundaries such that 
the region outside the confidence interval on each side 
contains 16% of the total distribution — in analogy to 
the one-sigma-interval of a Gaussian distribution. This 
ensures that the algorithm finds the proper boundaries 
even for asymmetric, non-G aussian d istributions, and in 
the case of multiple peaks (Figure 2). In summary, the 
output for the measured value corresponds to the fiftieth 
(50%) percentile (median), while the lower error bar cor¬ 
responds to the 50 th -16 th percentile and the upper error 
bar corresponds to 84 t/l -50* ,i of the metallicity estimate 
distribution. 

The distributions for the D02 calibrator include the un¬ 
certainty in the fit parameters: the oxygen abundance in 
this calibrator is generated as 12 ± log ln (Q/g) = 9.12 ± 
0.05 + (0.73 ± 0.10) x [N II]/Ha ( penicolo et aTpOO^ . 
The parameters of the fit are generated as the sum of 
the nominal parameters (9.12 and 0.73) and a Gaussian 
distributed random value centered on zero, and within 
a standard deviation of 0.05 and 0.10, respectively, in 
the above units. Similarly, the distributions for the 
M13 calibrators include the uncertainty in the fit pa¬ 


rameters as stated in Marino et al. (2013): the oxygen 
abundance as a function of [JN HJ/Ha is parametrized 
as 12 + log 10 (O/H) = 8.743 ± 0.027 + (0.462 ± 

0.024) x N 11/Hct, and as a function of ^ iq/mf as ^ + 
log 10 (O/H) = 8.533 ±0.012±(0.214 ±0.012) x 
We note again that our code does not output the sys¬ 
tematic uncertainty of each calibrator, which, for exam¬ 
ple, is ^0.15 dex for KD02. However, if all metallicity 
measurements are in the same calibrator and only rela¬ 
tive comparisons are made, as recommended by a number 
of authors, then the systematic error has no impact. 


4.3.1. Visual diagnostics to interpret the metallicity outputs 

In order for the user to check the validity of a measure¬ 
ment, and to better understand the distribution of metal¬ 
licity values, we provide two visualizations: for each set 
of input line fluxes, we generate a histogram of the out- 
put distri bution in all metallicity calibrators calculated 
(|Figure l[[2j and [3j), and for each set of input line fluxes 
we generate a box-and-whiskers plot (hereafter boxplot , 
for short ) summar izing the result of all calibrators cal¬ 
culated (Figure 4). All the pl ots generat ed are created 
with Python matplotlitj^ ( Hunter| |2007). 

Choosi ng the binnin g size for a histogra m is n ot a triv¬ 
ial task. |Hogg| ([2008|) , and |Ivezic et al.| (|2014D , among 
others, describes various data analysis recipes tor select¬ 
ing a histogram bin size. Too many bins will result in 
an “over-fit” histogram (showing structure that is in fact 
not in the distribution, thus perhaps decreasing the con¬ 
fidence in the metallicity measurement), while too few 
bins may miss features of the distribution (with the op¬ 
posite effect). We provide several methods to generate 
the histograms, in order to enable the best possible vi¬ 
sualization, given the computational power, for visual 
interpretation. However, we emphasize that our metal¬ 
licity and confidence interval calculation is robust to this 
choice, since it calculates the percentiles of the unbinned 
distribution. 

By default, we use Knuth’s method to choose the num¬ 
ber of bins Wins for each histogram. Knuth’s method 
optim izes a Bayesia n fitness function across fixed-width 
bins (Knuth 2006). Additionally, we enable a num¬ 
ber of binning options from which the user can choose, 
including: the square root of the number of po ints, 
Wins = VN , Rice rule (Wins = 2\fN, e.g., Hastie 
et al. 2009), Doane’s formula (Wins = 1 + log 2 iV + 

log 2 (1 + Kurt y/(N/ 6)), where Kurt is the third mo¬ 


ment of the distribution, 


Doane 

197f 

L9 

), and the full 

s Bayesian B 

ocks, which opti- 


mizes a fitness function across an arbitrary configu ratio n 


of bins, su ch that the bins are of variable size (Scargle 


et al.|[2013). The implementation of the latter method 
Ipll 
n JVi 


requires the astroMIr" ! Python package to be installed 


on the user’s system (Vanderplas et al. 2012). If the 
astroML package is not found) the code will default to 


18 http://matplotlib.org/ 

19 [L)oane| (1976J attempted to address the issue of finding the 
proper number of bins for the histogram of a skewed distribution. 
Several versions of the so-called Doane’s formul a exist in the liter¬ 
ature. The formula we implement is found in|Bonate|2011| 

20 https://github.com/astroML/astroML 
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TABLE 2 

Example Data and their Uncertainties based on Data in|Modjaz et al~ {;2011| 


site a [O II] A3727 

H p 

[O III] A4959 

[O III] A5007 

[O I] A6300 

Ha 

[N II] A6584 

[S II] A6717 

[S II] A6731 

08D 1.842 (.053) 
06fo 2.875 (.101) 

0.958 (.032) 
1.251 (.044) 

NaN 

0.168 (.028) 

0.302 (.029) 
0.064 (.025) 

0.127 (.021) 
NaN 

4.746 (.026) 
4.026 (.069) 

1.642 (.022) 
0.781 (.033) 

0.941 (.021) 
0.821 (.035) 

0.543 (.019) 
0.573 (.032) 

a These spectra were collect 

ed at the site ol 

f SN explosions: SN 

2008D (line 1) and 

SN 2006fo (line 2) 






Knuth’s method. As mentioned, Knuth’s method implies 
an optimization. In cases in which the convergence takes 
too long, or if the number of bins after the minimization 
is ./Vbins/ a/A” > 5 or Nbi ns /\/N < 1/3, the code will re¬ 
vert to Rice rule. Some methods may be computationally 
prohibitive with a very large sample size, or very little 
computational power, such as the Bayesian methods that 
rely on optimization (Knuth’s method and the Bayesian 
Block method) in which case it may be convenient for a 
user to choose Doane’s formula, Rice rule, or even the 
square root of the number of samples. 



12+log(0/H) 


Fig. 3.— The Kernel Density for the distribution of values for 
the K D02comb calibrator for the first measurement of our example 
data || Table 2). The Kernel Density is displayed as a blue shaded 
region, and is calculated via KD Tree with a Gaussian kernel with 
b andwidth given by Si lverman’s rule, and normalized, as described 
in |subsubscction 4.3. 1| The histogram of the distribution with bin 
size chosen according to Knuth’s method is also plotted (gray bins) 
and the m edian and confidence intervals are shown as described in 
|Figurc 1] 


Lastly, the user can generate and visualize the metallic- 
ity distribution Kernel Density if the sklearrrd package 
is installed. Kernel Density Estimation (KDE) allevi¬ 
ates the problem of choosing the bin size, at the cost 
of specifying a convolution kernel. The Kernel Density 
of the distribution is here calculated via KD Tree with 
a gaussian kernel, as explained in the sklearn package 
documentation^ The bandwidth of the kernel is chosen 
according to Silverman’s rule (Silverman 1986) The 
results will then show both a histogram, with iVbi ns cho- 


21 http://seikit-learn.org/stable/ 

22 http://seikit-learn.org/stable/modules/density.html 

23 By Silverman’s rule the bandwidth for the KDE kernel is set 
tow = 1.06cr N~ 0 ' 2 , where a is the sample standard deviation. 
Although the bandwidth chosen accordingly to Silverman’s rule 
is only optimal in the case of a Gaussian basis and a Gaussian 
distribution, and our distributions are explicitly not Gaussian, as 


sen via Knuth’s method, as well as the distribution Ker¬ 
nel Density, as shown in |Figure 3} The KDE is saved 
as a binary python object (an sklearn KernelDensity 
object), so that it can be recovered outside the code and 
used as a probability distribution for further inference. 

The histograms are always normalized so that the high¬ 
est bin value is 1, and the Kernel Density is normalized 
to contain the same area as the overplotted histogr am. 


At each run the code also generates a boxplot (Fig¬ 


ure 4) that summarizes the result from each calibra¬ 
tor the user chooses to calculate. For each calibra¬ 
tor, the median of the 12 + log 10 (O/H) distribution 
is plotted as a black horizontal line. The height of the 
corresponding box represents the 25 th percentile of the 
12 + log 10 (O/H) distribution, known as the interquartile 
range (IQR). The bars represent the maximum and min¬ 
imum value of the distribution, excluding outliers. The 
outliers are plotted as circles, and are defined as all data 
points farther than 1.5 x IQR from the 25 th percentile 
(i.e. from either end of the box). 

The Solar oxygen abundance is indicated in this 
plot for comparison: a gray box shows a range of 
estimated values for S olar oxygen abundan ces, from 

to 12 T 
iagnostics 


2009) 
die di 


12 + log in (Q/j?)= 8,69 (Asplund et al. 
log 10 (O/iL)=8.76 (|Caffau et al.|j201l|). 
reques ted by the user have a slot in the plot (in the exam¬ 
ple in |Figure 3| the computed calibrators are the PP04, 
the M13, and the KD02 calibrators), as do diagnostics 
that are computed in support of other requested diag¬ 
nostics (M91 in this case, which is needed to calculate 
KD02_comb). However a slot exists in the plot whether 
the diagnostic can be produced or not, i.e. if the set 
of input lines does not allow a requested calibrator to 
be calculated an empty column will be generated in this 
plot. 

With this plot, the user can immediately check for con¬ 
sistency or scatter in the metallicities derived by the re¬ 
quested calibrators (e.g., there are well-known system¬ 
atic offsets between different methods, where the T e - 
based, empirical methods usually give lower values than 
the photoionization based ones), and coarsely assess the 
shape of the distribution in each calibrator (e.g. strong 
asymmetry or bimodality would show up as an asym¬ 
metric box, or a very asymmetric distribution of out¬ 
liers). Although, as we stated earlier, relative comparison 
within a metallicity calibrator are valid despite the typ¬ 
ical spread in metallicity derived by different strong line 
methods, a particularly large spread may indicate dif¬ 
ficulties in assessing metallicity for this spectrum. The 
user is responsible for understanding the strengths and 
caveats of the various diagnostics and in which ranges 
and conditions they may be used. The visualization tools 
we provide are meant to help the user understand the 


explained earlier, this kernel parametrization generally provides 
good results for our metallicity distributions. 
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Fig. 4.— A box-and-whiskers plot showing the comparison of the 
results of nine metallicity calculations, corresponding to five cali- 
brato rs as liste d above, calculated from the same set of measured 
lines ||Table 2| host galaxy of SN 2008D). For each calibrator, the 
median oi tile resulting distribution is plotted as a horizontal line, 
the interquartile range (IQR) is represented as an orange box, and 
the bars, joined to each end of the box by a dashed line, represent 
the minimum and maximum of the distribution excluding outliers, 
where outliers are defined as any point farther than 1.5 X IQR from 
the edges for the IQR. A range of values for Solar oxygen abun¬ 
dance that are commonly found in the literature is shown as a gray 
horizontal band. 


reliability of the strong-line metallicity derived for a par¬ 
ticular HII region. 


4.4. Visual diagnostic to assure completeness of the MC 

simulation 

The user chooses N, the number of samples to be gen¬ 
erated. The sample size is automatically increased by 
10% at the beginning of the run, in order to assure that 
even if during the calculations some of the output metal- 
licities were to result in non-valid values (iVafV’s or in¬ 
finities, for example, if divisions by very small numbers 
are required) the actual sample size is at least as large as 


the user intended. The code rarely produces non-valid 
values, and if the size of the valid output distribution 
were in the end smaller than the requested value N (if 
the number of non-valid outputs is larger than O.liV), 
the user should worry about the reliability of the set of 
input line fluxes used. 

The reliability of the metallicity estimates depends cru¬ 
cially on the sample size being sufficiently large to prop¬ 
erly characterize the distribution of metallicity values. It 
is, however, not trivial to decide when N is sufficiently 
large. As soon as N is large enough, and the distribution 
is well characterized, increasing the synthetic sample size 
will not change its shape. 

Let us consider a cumulative distribution for a metal¬ 
licity calibrator: D02, for example, which has noise from 
the measurement errors, as well as from the error in the 
fit parameters, or KD02, which uses a non-linear combi¬ 
nation of the input line flux values. We generate three 
distributions with N = 200, N = 2, 000, and N = 20, 000 
for both D02 and KD02comb. For each of these distribu¬ 
tions we randomly select four subsamples of size O.liV, 
0.25 N, 0.5 N, and 0.757V, and we overplot the cumulative 
distribution of each subsample to that of the full sam¬ 
ple. If all subsamples, including the smallest one (O.liV) 
constituted a sufficiently large MC sample to properly 
characterize the underlying distribution, all these cumu¬ 
lative distribution would typically appear smooth and, 
mos t importan tly, they would overlap. 

In | Figure 5| we show these cumulative plots for a dis¬ 
tribution generated with N = 200, N = 2,000, and 
N = 20,000 for D02 and KD02comb. The cumula¬ 
tive distributions for the subs ets of the N = 200 sam¬ 
ples simulation (left panel in Figure 5) are noisy and 
rather different from each other, indicating that 200 data 
points are not a sufficiently large dataset. Conversely, at 
N = 20, 000 (right panel) the distributions are indistin¬ 
guishable, indicating that a sample of size N = 2, 000 
(=0.1iV) is already large enough to fully describe the 
distribution. The N = 2, 000 cumulative distributions 
rapidly converge as the subsample size increases, indicat¬ 
ing that a value of N between 200 and 2,000 samples will 
characterize the distribution appropriately for our exam¬ 
ple dataset. In light of this we choose, conservatively, 
N = 2,000 to run our simulations for this dataset. 

The appropriate number N of samples will depend on 
the calibrator calculated and on the input data. The er¬ 
rors on the measurements and the calculations performed 
to derive the metallicity from line ratios, which for many 
calibrators are non-linear, will determine the shape of the 
distribution, and thus the number of data points that are 
needed to fully characterize it. Consult the repository 
READMEpq for instructions on how to generate these 
diagnostic plots. 


4.5. Performance and Benchmarks 

We ran a benchmark calculation on a MacBook Air 
with a dual-core Intel Core i7 (1.7 GHz) and 8GB of 
1600 MHz DDR3 memory. The dataset we used for the 
calculation includes two sets of line measurements (mea¬ 
surement 1, the host galaxy of SN 2008D, and measure¬ 
ment 2, the host galaxy of SN 2006fo). The flux values 

24 https://github.com/nyusngroup/pyMCZ/blob/master/ 
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D02 Cumulative Distribution D02 Cumulative Distribution D02 Cumulative Distribution 




KD02comb Cumulative Distribution 



KD02comb Cumulative Distribution KD02comb Cumulative Distribution 



Fig. 5.— C umulative plot s of the distribution of m etallicity values for the D02 (|Denicolo et al.|2002| and KD02 calibrator (|Kewley &| 
Dopita 2002| as updated by|Kewley & Ellison 2008]), chosen here just as examples, w here x indicates 12 + log 10 (O/Ff). The input data 
is "example data 1”, the emission line values tor the host galaxy of SN 2008D from |Modjaz et al. | ([20 11). This plot provides a visual 
diagnostic of sample completeness. In each plot the cumulative distribution of metallicity values is shown tor randomly chosen subsamples 
of 10%, of 25%, 50%, and 75% of the data, and for all data in the distribution. In the left top and bottom plots, the distributions are 
generated from an N = 200 sample, in the center plots from an N = 2, 000, and in the right-most column from aniV = 20, 000 samples. The 
increasing overlap of the distributions reflects increasing completeness. In the left plots, the distributions do not fully overlap, indicating 
that completeness is not achieved with the N = 200 sample. On the other end, since all subsamples are indistinguishable in the rightmost 
top and bottom plots, from an N = 20, 000 sample, we conclude that completeness is already achieved at the smallest subsample in the 
plots on the right, 10% of the N = 20, 000 sample, for our example data. That is, TV = 2, 000 is a sufficiently large sample for these data, 
and these diagnostics. 


and their associated errors are shown in ITablc 21 The 
code performs simple algebraic operations on large data 
arrays. It is vectorized along the sample-size dimension, 
so that the code loops over the smaller dimension cor¬ 
responding to the number of sets of line measurements 
in input (two in our example data), while operations are 
generally performed on the TV-dimensional vectors stor¬ 
ing the synthetic measurements, and the TV-dimensional 
variables derived from them (certain calibrators, such as 
KD02 and KK04, derive the ionization parameter in an 
iterative fashion, and require further loops). 

With full graphic output (all histograms are plotted) 
and performing all default metallicity calculations, ex¬ 
cept those of D13 ( pyqz ), for our example datasets and a 
sample size TV = 2,000 the time required by the code is 
~ 9 seconds (wall clock time, and less than half a second 
longer in total CPU time, as the machine we tested on is 
dual-core), and less than 0.3 seconds of CPU time spent 
in the kernel within the process. Including the calibrators 
of D13 pyqz for the same datasets, the run time becomes 
~ 19 seconds of clock time (and actual CPU time). The 
code time was tested on sets of 1, 2, 5, 10, 25, 50, and 100 
identical measurements (copies of the emission line fluxes 


of the SN 2008D host galaxy in our example dataset) and 
the clock time is found to scale roughly quadratically 
with the number m of measurement in input, but with 
a small quadratic coefficient: t ~ 7m + 0.07m 2 . This 
means effectively that the 100 measurements sample will 
take 200 times longer than the 1 measurement sample (of 
course with dependence on which line values are available 
in each measurement, enabling different metallicity cali¬ 
brators to be calculated). For the CPU time spent in the 
kernel we find a roughly linear relation with the number 
of input measurements, with a very shallow dependency: 
t ~ 0.05m. This is the actual computational time in 
calculations performed in the code: most clock time is 
in fact spent in root finding, input-output, and plotting 
activities. 

The time spent on plotting functions, which includes 
the calculation of the appropriate number of bins for each 
calibrator, is substantial: 1.67 seconds per distribution 
on average with 14 calls for this dataset (one for each 
metallicity calibrator, E(B-V), and i? 2 3 ). We summarize 
the run time spent on each metallicity calibrator and 
memory usage for the host galaxy of SN 2008D, t he first 
measurement of the sample dataset, in|Figure 6| While 
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the CPU usage is modest, the memory usage can be large, 
depending on the size of the sample. The memory usage 
ramps up quickly, as soon as the N line samples are cre- 
ated, and remains fairly constant throughout the run. 
|Figurc 6| shows the memory used by each function in the 
code as a function of time for a single set of input lines 
of our example data (the host galaxy of SN 2008D). The 
calibrators that take longer are those that require root 
finding (e.g. M08), or optimizations (which are done it¬ 
eratively, e.g. KD02_N2O2). 



Fig. 6.— Memory usage: we plot the square root of the memory 
usage in Megabytes as a function of time for running our code (us¬ 
ing TV=2,000 and all default metallicity calibrators except the D13 
pyqz one s, and PP04) on a single set of measured emission lines 
(|Table 2| host galaxy of SN 2008D). The square root is plotted, 
instead of the natural value, to enhance visibility. Three inserts 
show the regions where most of the metallicity calibrators are cal¬ 
culated, zoomed in, since the run time of the code is dominated 
by plotting routines, including the calculation of the bin size with 
Knuth’s rule. Each function call is represented by an opening and 
closing bracket in the main plot, and by a shaded rectangle in the 
zoomed-in insets. The calculation of N202, which requires 0.25 
seconds, is split between two insets, as well as the calculation of 
the M08 calibrators (three calibrators) which require 0.65 seconds. 
Altogether, the calculation of metallicity in all default calibrators, 
except the D13 pyqz calibrators, takes ~ 1 seconds. For technical 
details about the D13 pyqz calibrators performance we refer the 
reader to the pyqz package. 


The code is trivially parallelizable, as each measure¬ 
ment can be computed on a different core, and multi¬ 
processing is enabled via the multiprocessing Python 
module. When multiprocessing is enabled, pyMCZit uses 
up to n— 1 cores, where n is the total number of available 
cores to the user, or less, if a smaller maximum number 
of cores is set by the user via the global variable MAX- 
PROCESSES. 

5. COMPARISON TO PRIOR UNCERTAINTY 
COMPUTATION AND OTHER WORKS 


SN host sample for our comparison, since investigations 
of the environments of extra-galactic transients, where 
observational errors dominate, are the studies that can 
benefit most from our implementation of metallicity cal¬ 
culations. We compare our resu lts with the results that 
appeared in Modjaz et al. 2TTTT1 since for those results we 
have both all flux values in input and their errors, and 
the metallicities and uncertainties obtained in output. 
We note that this work includes only three calibrators: 
M91, PP04_O3N2, and KD02comb; however these cali¬ 
brators cover the case of calibrators that are fully linear, 
that rely on decision trees, and mixed calibrators. We 
stress again that the PP04 calibrators are superseded by 
the M13 ones, but they can be used in this comparison 
nonetheless. Since our work is the implementation of 
existing calculations, it is beyond the scope of this pa¬ 
per to include a complete comparison of all calibrators: 
that burden was on the original authors and we refer the 
reader to the papers that present each method for that. 

A previous method for determining the uncertainty in 


the oxygen abundance, used in Modjaz et al. 


well as, for example in Modjaz et al.|2008| jKewley et al.| 


2011 


as 


|2010| |Rupke et al.[20ld| was an analytic approach, where 

the emission-line rlux uncertainties are propagated into 
a measurement “envelope”: it found the extreme abun¬ 
dances obtained from the input fluxes minimizing (maxi¬ 
mizing) the indicators used in the calculations by adding 
(subtracting) to the measured line flux values their un¬ 
certainties, thus producing envelope values for the metal¬ 
licity. For comparison, we computed the metallicities 
and their errors in both ways (analytically and using our 
current MC sampling method) for three represent ative 
calibr ators. We plot our results and the residuals in Fig-| 
|ure 7[ The errorbars in the residuals are generated by 
adding asymmetric errors in quadrature: residual m ; n = 

\/^max + 2/min and residual max = Vx^ in + y^ ax . 

This figure highlights a number of important points: 

• The metallicity reported as the 50 t/l percentile of 
the metallicity parameter distribution from the MC 
sampling method is consistent with the analytically 
derived metallicity. The mean of the residuals is 

-2xl0" 3 , ~ 5xl0" 3 , and ~ 5xl0" 2 , for M91, 

PP04_O3N2, and KD02_comb respectively, for 20, 
32, and 35 valid measurement from the |Modjaz| 


et al. (20111 data; the standard deviation of the 
residuals is ~ 0.018, 0.005, 0.004 respectively - 
well within the respective error bars - and thus, the 
published results still stand (unsurprisingly, since 
our code, aside for the calculation of the confidence 
interval, uses for these calibrators the same algo¬ 
rithms developed for IDLKD02). 


• The MC sampling method has smaller error bars 
than the analytic method. This is easily un¬ 
derstandable, since the analytic method assumes 
the worst-case-scenarios, as it basically yields two 
metallicity parameter draws (the “minimum” and 
“maximum”) which are in the tail of the full metal¬ 
licity probability distribution. The MC sampling 
method empirically characterizes the full parame¬ 
ter estimation distribution. 


We compare our method with the results that appear 
in |Modjaz et al. 2011 for 19 SN galaxies. We choose a 


The (minimal and statistically not significant) differ¬ 
ences between our and the IDLKD02 implementation of 
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the KD02_comb calibrator arise from numerical differ¬ 
ences in the solutions of the root finding algorithms, and 
in the optimizations, which we carry out until conver¬ 
gence, while IDLKD02 estimated three iterations would 
be sufficient for minimization. The differences in the M91 
and PP04_O3O2 calibrators are only due to the shape of 
the distributions. 



Fig. 7. — Comparison of metallicity estimation between the ana¬ 
lytic method and our MC sampling method (top) and their resid¬ 
uals (bottom) for three different metallicity calibrators. Fl ux mea¬ 
surements c ome from 19 galaxies previously measured in |Modjaz] 
|et al. |(f2011). Note that for KD02_comb and M91, and in most cases 
for FP04_O3N2 as well, the MC sampling error bars are smaller 
than those of the analytic propagation, which assumes worst-case 
scenarios. The metallicity value derived by our code, which is the 
50th percentile of the metallicity distribution from MC sampling, 
is consistent with the analytically derived metallicity in all calibra¬ 
tors, for all measurements. 


5.1. Comparison with other works 

The field of SN host metallicity studies is rapidly de¬ 
veloping, and these studies are crucial avenues for con¬ 
straining the progenitor systems of different kinds of ex¬ 
plosions. However, some the works in this field do not 
compute errors in the derived metallicities, and others do 
not show how they compute their statistical errors (e.g., 


Anderson et al.|2010l ILeloudas et al.|2011||Sanders et al. 


2012; Leloudas et al. 2014|). 

m contrast, the general metallicity field has considered 
in detail how to estimate the uncertainties in measured 
metallicities. However, when the codes are designed 
for local Galaxy metallicity measurements, they neglect 
the observational uncertainties, since they are negligi¬ 
ble compared to other sources of error. Other codes are 
not open-source, and many of them are for spe cific cal¬ 
ibrators. For example: Moustakas et al. (20101 also use 
MC sampling to estimate the metallicity uncertainties 
(in their case using iV=500 trials and assuming a Gaus¬ 
sian distribution for metallicity parameter distribution) 
but only do this for two calibrators, KK04 and P05. 

Some strong line methods fit a set of stellar popula¬ 


tion synthesis and photoionization models directly to the 
spectra of HII regions by changing the model parameters. 
For computin g the metallicities of the SDSS star form¬ 
ing galaxies, Tremonti et al. (2004) fit a combination of 
stellar population synthesis models and photoionization 
models to the observed strong emission lines [O II], H/3, 
[O III], Ha, [N II] and [S II] and report the median of 
the metallicity likelihood distribution as the metallicity 
estimate, with the width of the distribution giving the 
1-a (Gaussian) error. Blanc et al. (20151 use Bayesian 
inference to derive the joint and marginalized posterior 
probability density functions for metallicity Z and ion¬ 
ization parameter q given a set of observed line fluxes 
and an input photoionization model. They provide a 
publicly available IDL implementation of their method 
named IZI (inferring metallicities - Z - and ionization 
parameters). 

6. CONCLUSIONS 


We have presented the open-source Python code pyMCZ 
that determines oxygen abundance and its error distribu¬ 
tion from strong emission lines in a total of up to 15 diag¬ 
nostics, 11 default and 4 additional upon request, where 
by diagnostic we intend a suite of metallicity calibrators 
that are published in a single article, and for a total of up 
to 32 different metallicity calibrators (since a number of 
diagnostics produce different metallicity estimates based 
on different input indicators). These estimates are based 
on the original KD02IDL code in KD02, and updated 
in KE08, and expanded to include more recently devel¬ 
oped calibrators, and to generate confidence intervals for 
each calibrator via MC sampling. We supply visualiza¬ 
tion tools that, as part of the default output of pyMCZ, en¬ 
able the user to assess the validity of each derived metal¬ 
licity distribution and understand when line flux mea¬ 
surements may lead to misleading metallicity estimates, 
for example in proximity of the demarcation between the 
upper and lower branch for I? 23 -based methods. pyMCZ 
outputs the full estimated metallicity distribution (on de¬ 
mand), and its Kernel Density, and it offers visualization 
tools to assess the spread of the oxygen abundance in the 
different calibrators. 

The validity of our metallicity measurements and of 
their confidence regions hinges upon generating proba¬ 
bility distributions that properly sample the metallicity 
distribution, given the input parameters and the specific 
metallicity calculation algorithm. Thus, we have devel¬ 
oped metrics that allow the user to ascertain that the 
sample drawn in the simulation is sufficiently large. How¬ 
ever, we note that our code does not take into account 
systematic errors, but only computes the statistical ones. 

This code is open access and we welcome input and fur¬ 
ther development from the community. Since the code is 
open-source, the users can include other metallicity cal¬ 
ibrators and modify any parts of the algorithms, or any 
assumption (e.g. that the line flux errors are Gaussian 
distributed). 

We hope that this open-access code will be helpful for 
the many different fields where gas-plrase metallicities are 
important, and observational uncertainty are important 
in the error budget, including in the emerging field of SN 
and GRB host galaxies. 
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